COFFIN: A Computational Framework for Linear SVMs
Authors
Abstract
In a variety of applications, kernel machines such as Support Vector Machines (SVMs) have been used with great success, often delivering state-of-the-art results. Using the kernel trick, they work on several domains and even enable heterogeneous data fusion by concatenating feature spaces or multiple kernel learning. Unfortunately, they are not suited for truly large-scale applications since they suffer from the curse of supporting vectors, i.e., the speed of applying SVMs decays linearly with the number of support vectors. In this paper we develop COFFIN, a new training strategy for linear SVMs that effectively allows the use of on-demand computed kernel feature spaces and virtual examples in the primal. With linear training and prediction effort, this framework leverages SVM applications to truly large-scale problems: as an example, we train SVMs for human splice site recognition involving 50 million examples and sophisticated string kernels. Additionally, we learn an SVM-based gender detector on 5 million examples on low-tech hardware and achieve beyond state-of-the-art accuracies on both tasks. Source code, data sets and scripts are freely available from http://sonnenburgs.de/soeren/coffin.
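The core idea of the abstract, training a linear SVM in the primal on kernel feature spaces that are computed on demand rather than stored, can be sketched roughly as follows. This is an illustrative Pegasos-style SGD with a hypothetical hashed k-mer feature map for strings; it is not the authors' implementation, and all function names are my own.

```python
import numpy as np

def hashed_kmer_features(seq, k=3, dim=2**10):
    """Compute a hashed k-mer count vector on demand (never stored),
    so memory stays O(dim) regardless of the number of examples."""
    x = np.zeros(dim)
    for i in range(len(seq) - k + 1):
        x[hash(seq[i:i + k]) % dim] += 1.0
    return x

def train_linear_svm_sgd(seqs, labels, dim=2**10, lam=0.01, epochs=20):
    """Pegasos-style SGD on the primal hinge loss; features are
    recomputed on the fly for each visited example."""
    rng = np.random.default_rng(0)
    w = np.zeros(dim)
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(len(seqs)):
            t += 1
            eta = 1.0 / (lam * t)
            x, y = hashed_kmer_features(seqs[i], dim=dim), labels[i]
            if y * w.dot(x) < 1:          # hinge-active: subgradient step
                w = (1 - eta * lam) * w + eta * y * x
            else:                         # only shrink the weights
                w = (1 - eta * lam) * w
    return w

# toy usage: separate AT-rich from GC-rich strings
pos = ["ATATATAT", "TATATATA", "AATTAATT"]
neg = ["GCGCGCGC", "CGCGCGCG", "GGCCGGCC"]
seqs, labels = pos + neg, [1, 1, 1, -1, -1, -1]
w = train_linear_svm_sgd(seqs, labels)
preds = [np.sign(w.dot(hashed_kmer_features(s))) for s in seqs]
```

Because prediction is a single dot product with w, its cost is independent of the number of training examples, which is the property that sidesteps the curse of supporting vectors described above.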
Similar resources
Efficient Training of Graph-Regularized Multitask SVMs
We present an optimization framework for graph-regularized multi-task SVMs based on the primal formulation of the problem. Previous approaches employ a so-called multi-task kernel (MTK) and thus are inapplicable when the number of training examples n is large (typically n > 20,000, even for just a few tasks). In this paper, we present a primal optimization criterion, allowing for general loss...
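A primal objective of this kind can be sketched as a sum of per-task SVM objectives plus a term that couples the weight vectors of tasks joined by a graph edge. The following subgradient-descent sketch uses my own notation and hyperparameters, not the paper's exact criterion; it illustrates why the primal view scales, since no n x n multi-task kernel matrix is ever formed.

```python
import numpy as np

def multitask_svm_primal(Xs, ys, edges, C=1.0, mu=0.5, epochs=200, lr=0.05):
    """Subgradient descent on a graph-regularized multitask SVM primal:
        sum_t [ 0.5*||w_t||^2 + C * hinge(w_t; X_t, y_t) ]
        + 0.5*mu * sum_{(s,t) in edges} ||w_s - w_t||^2
    Working in the primal avoids building an n x n multi-task kernel."""
    T, d = len(Xs), Xs[0].shape[1]
    W = np.zeros((T, d))
    for _ in range(epochs):
        G = W.copy()                          # gradient of 0.5*||w_t||^2
        for t in range(T):
            margins = ys[t] * (Xs[t] @ W[t])
            viol = margins < 1                # hinge-active examples
            G[t] -= C * (ys[t][viol, None] * Xs[t][viol]).sum(axis=0)
        for s, t in edges:                    # graph coupling term
            diff = W[s] - W[t]
            G[s] += mu * diff
            G[t] -= mu * diff
        W -= lr * G
    return W

# toy usage: two related tasks sharing the same separable data
X = np.array([[1.0, 0.0], [-1.0, 0.0]])
y = np.array([1.0, -1.0])
W = multitask_svm_primal([X, X], [y, y], edges=[(0, 1)])
```

With identical task data and a symmetric coupling term, the two learned weight vectors coincide, which is the intended effect of the graph regularizer.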
Genetic Programming for Kernel-Based Learning with Co-evolving Subsets Selection
Support Vector Machines (SVMs) are well-established Machine Learning (ML) algorithms. They rely on the fact that i) linear learning can be formalized as a well-posed optimization problem; ii) non-linear learning can be brought into linear learning thanks to the kernel trick and the mapping of the initial search space onto a high-dimensional feature space. The kernel is designed by the ML expert...
APPLICATION OF KRIGING METHOD IN SURROGATE MANAGEMENT FRAMEWORK FOR OPTIMIZATION PROBLEMS
In this paper, Kriging has been chosen as the method for surrogate construction. The basic idea behind Kriging is to use a weighted linear combination of known function values to predict a function value at a place where it is not known. Kriging attempts to determine the best combination of weights in order to minimize the error in the estimated function value. Because the actual function value...
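The weighted-linear-combination idea described above can be sketched in a few lines: the Kriging weights are obtained by solving a linear system built from a covariance model, so that the estimated value at known sample points reproduces the data. This is a minimal simple-kriging sketch with an assumed Gaussian covariance, not the paper's surrogate-management implementation.

```python
import numpy as np

def kriging_predict(X, y, x_new, length=1.0, noise=1e-10):
    """Simple-kriging sketch: the prediction at each query point is a
    weighted linear combination of the known values y, with weights
    chosen to minimize the expected squared estimation error under a
    Gaussian covariance model."""
    def cov(a, b):
        d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
        return np.exp(-0.5 * (d / length) ** 2)
    K = cov(X, X) + noise * np.eye(len(X))   # covariances among samples
    k = cov(X, x_new)                        # covariances to query points
    weights = np.linalg.solve(K, k)          # best linear weights
    return weights.T @ y

# toy usage: interpolate a smooth function from a few samples
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.sin(X[:, 0])
y_hat = kriging_predict(X, y, X)             # predictions at known points
```

At the sampled locations the predictor interpolates the data (up to the small noise jitter added for numerical stability), which matches Kriging's role as an exact-interpolation surrogate.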
An Efficient Classifier Based on Hierarchical Mixing Linear Support Vector Machines
Support vector machines (SVMs) play a very dominant role in data classification due to their good generalization performance. However, they suffer from high computational complexity in the classification phase when there is a considerable number of support vectors (SVs). It is therefore desirable to design efficient algorithms for the classification phase to deal with datasets of real-time pa...
Scalable, accurate image annotation with joint SVMs and output kernels
This paper studies how joint training of multiple support vector machines (SVMs) can improve the effectiveness and efficiency of automatic image annotation. We cast image annotation as an output-related multi-task learning framework, with the prediction of each tag's presence as one individual task. Evidently, these tasks are related via dependencies between tags. The proposed joint learning framework...